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Abstract 

We have developed an efficient computational scheme utilizing the real-space finite-difference 
formalism and the projector augmented-wave (PAW) method to perform precise first-principles 
electronic-structure simulations based on the density functional theory for systems containing tran- 
sition metals with a modest computational effort. By combining the advantages of the time-saving 
double-grid technique and the Fourier filtering procedure for the projectors of pseudopotentials, 
we can overcome the egg box effect in the computations even for first-row elements and transition 
metals, which is a problem of the real-space finite-difference formalism. In order to demonstrate 
the potential power in terms of precision and applicability of the present scheme, we have carried 
out simulations to examine several bulk properties and structural energy differences between dif- 
ferent bulk phases of transition metals, and have obtained excellent agreement with the results of 
other precise first-principles methods such as a plane wave based PAW method and an all-electron 
full-potential linearized augmented plane wave (FLAPW) method. 

PACS numbers: 31.15.xf, 02.70.Bf, 71.15.-m, 73.22.-f 
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I. INTRODUCTION 



Density functional theory [l| (DFT) in various approximations to the unknown exchange 
correlation potential developed to the most important practical scheme to investigate or 
determine the electronic, structural and many other properties of molecules, solids, and ma- 
terials in physics, chemistry, materials science, mineralogy and reaches now into new fields 
such biophysics and electro-chemistry. A typical trend observed is that we deal with in- 
creasingly more complex systems characterized by many atoms of different chemical nature 
in open structures and of low symmetry. This path is supported by the increasing availabil- 
ity of increasingly powerful computers spearheaded by high-performance computers. The 
developments of the latter show that we have to cope with massively parallel computer 
architectures dealing with thousands of cores. This path accelerates in the next 10 years 
with the attempt moving from peta-scale to exa-scale computing. Thus, developing elec- 
tronic structure methods whose applicability scales with the available processes becomes a 
prerequisite. 

The real-space scheme of first-principles calculations, in which all computations are im- 
plemented in real space, is a method that has the potential to scale with massively parallel 
architectures and has this potential without compromise on the precision and thus should 



y 



be superior to conventional plane wave methods 2]. Real-space methods can be loose , 
categorized as one of three types: finite differences^-^, finite elements^], or waveletsjl]. 
Chelikowsky et a/. (3, (4] have presented the real-space method combining the high-order 
finite-difference formula for the second derivative of the Kohn-Sham equation! 81 and the 

nn 

norm-conserving pseudopotential[9|, H0[ and demonstrated its applicability for the investiga- 
tion of the cohesive energy and bond length of diatomic molecules. There exist alternative 
high-order discretizations such as the Mehrstellen form used in the work of Briggs et a/. 
Several techniques to improve the precision and accelerate the computational speed have 
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been proposed so far[5|, Il2|-|l6l|. Further advantages of real-space finite-difference (RSFD) 
formalisms are that (i) the computational costs involved in calculating the projectors of pseu- 
dopotentials can be reduced when the calculations are implemented in real space, (ii) Since 
all of the calculations are carried out in real space, it is easy to incorporate Wannier-type 
orbitals, which are localized in a finite region required for the realization of linear scaling 
calculations 0, Q, into the algorithm, (iii) The grid spacing should be narrowed in order 



to improve the calculational precision, a procedure, which is simple and definite, and (iv) 
boundary conditions are not constrained to be periodic, e.g., combinations of periodic and 
nonperiodic boundary conditions for surfaces and wires, uneven boundary condition for tri- 
clinic systems, and twist boundary conditions for helical nanotubes are included straight 



forwardly 19J . In particular point (iv) is of significant advantage for electron-transport 



lylia. 

ions [5 



calculations |5|, |20j, because a nonperiodic boundary is indispensable for the direction in 
which electrons flow. 



Norm-conserving pseudopotentials, which were introduced by Hamann et al[9\, have 
made significant contributions to the description of the band structure of semiconductors 
and simple metals and the computation of their bond lengths, crystal structures and surface 
reconstructions as well as vibrational modes of molecules [21|. However, for systems with 
first-row elements or 3d electrons, the norm-conserving pseudopotentials are very hard so 
that a large plane-wave basis set or an extremely small grid spacing is required. Similarly, 
treating semicore states as valence states, which is often necessary for transition metals 
or compounds with light elements, e.g., GaN, results in hard pseudopotentials and affects 
their transferability. Some of these problems can be avoided employing Vanderbilt's ultra- 



soft pseudopotentials [221] . which relax the norm-conservation condition and are now adopted 
quite widely. Another alternatives is full-potential all-electron methods such as the Korringa- 
Kohn-Rostoker (KKR)-Greenfunction|23j or the full-potential linearized augmented plane- 
wave (FLAPW) method 24|, |25|. Both methods provide the Kohn-Sham answer of the given 
exchange correlation function taken for the problem at hand and provide in addition a pre- 
cise treatment of wave functions near the nuclei probed by several experimental techniques 
and can supply properties that are usually not provided by the conventional pseudopoten- 
tial approach. These are, among many others, the hyperfine parameters and electric field 
gradients, but also correct magnetic structures. In particular, for simulations related to 
spintronics, the correct description of magnetism the precise treatment of the localized d- 
levels is of crucial importance. B16chl||26] proposed a state-of-the-art all-electron method, 
called the projector augmented-wave (PAW) method, that retains the formal simplicity and 
practicability of the traditional pseudopotential approach, but matches the precision of the 
full-potential all-electron methods. 

Mortensen et al. {23] implemented the PAW method into their RSFD computational code, 
Grid-based Projector Augmented- Wave (GPAW) code, and demonstrated that the code, in 



terms of computational efficiency, is comparable to a plane wave based PAW (PWPAW) 
method by computing the bond length and atomization energy of small molecules consisting 
of first- and second-row elements. In addition, they claimed that the average difference of 
atomization energies of small molecules from the results obtained by other computational 
codes is 50 meV. However, the RSFD calculations have never been applied in simulations 
that require extremely high precision such as comparisons of small structural energy dif- 
ferences between different bulk phases of transition metals. For example, the difference of 
the cohesive energy between face-centered-cubic (fee) Cu and hexagonal close-packed (hep) 



Cu is just 8 meV/ atom according to the FLAPW calculation 24j, |25j using the local density 



approximation 
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and the required precision for the discussion of a pressure-induced phase 



transition for CuPt from Lli to B2 is less than 10 meV 

In the RSFD formalisms, there is a well-known problem that the total energies and forces 
depend unphysically on the position of the nucleus relative to the positions of grid points, 
which is called the "egg box effect." This problem is an obstacle to the precise computation 
of the total energies of systems containing transition metals with a moderate grid spacing. 
Although, reliable results can be achieved using a very small grid spacing in the RSFD for- 
malisms, one of the greatest benefits of the PAW method, the precise treatment of transition 
metals with modest computational effort in CPU time and computer memory, will be lost 
and instead the computational cost is expected to increase substantially. Furthermore, in 
the case of PAW method, the grid spacing required for transition metals is smaller than 
that for first-row elements, while this relation is opposite in the case of the norm- conserving 
pseudopotentials. Thus, the egg box effect is more severe for transition metals in the case 
of the PAW method and it is of great importance to develop methods to circumvent the 
egg box effect. Several prescriptions to deal with it have been proposed over time and their 
applications proved successful within the framework of norm-conserving pseudopotentials up 



to now 12Hl6l|. However, as far as we know, there are no reports on efficient techniques that 
allow us to perform extremely precise simulations such as examinations of small structural 
energy differences of transition metals with a moderate grid spacing. 

In this paper, we present an efficient computational scheme with a high degree of precision 
within the framework of RSFD formalisms making use of the PAW method to enable large- 
scale first-principles simulations for systems containing transition metals. By combining the 



advantage of the time-saving double-grid (DG) technique [12l. Il3| and Fourier filtering (FF) 



procedure for projectors of pseudopotentials|30|], we have succeeded to reduce the number 
of grid points employed in the calculations in the case of Cu 75% compared to our previous 
procedure 13j and the precision is improved. In order to demonstrate the performance of 
our scheme, we study the total-energy convergence with respect to the grid spacing and the 
energy variation due to the egg box effect. We also calculate the bulk properties of various 
3c? transition metals and Cu, as well as the structural energy differences between different 
bulk phases of those. We compare these results to present that this RSFD formalism is 
very precise and that the precision is comparable to those to in-house calculations using 



the VASP code 



3JJ, which bases on the PWPAW method, and the FLEUR code[25j, which 



uses the FLAPW method 



24i | . The advantage of the in-house comparison is that we can 



use the same exchange correlation potential and the same pseudopotential and can converge 
the properties in question individually to achieve an accurate comparison. This comparison 
shows that indeed the RSFD formalism in combination with the PAW method is very precise 
and is able to achieve full-potential all-electron precision. 

The rest of this paper is organized as follows: in Sec. II, we briefly introduce the trials 
for the egg box effect in the RSFD formalism and present our prescriptions to deal with this 
problem in detail. We introduce some examples to demonstrate the potential power of our 
scheme in Sec. III. Finally, in Sec. IV, we conclude with a discussion on the future direction 
of the RSFD electronic-structure calculations. 



II. METHODS 

A. Egg box effect 

In the RSFD formalism, real-space grid points are distributed across the computational 
region in which the atoms are distributed and wave functions, electronic charge density, and 
potentials are all represented on the discrete grid points. The egg box effect describes the 
phenomenon that the total energies and forces are affected unphysically by the positions of 
the grid points relative to the nucleus, although their discretizations are not invariant under 
uniform translations of the system with respect to the position of the grid. This problem 
occurs also even in plane wave formalisms when the operations concerning the projectors of 



pseudopotentials are implemented in real space 



30| . Although we can overcome the egg box 



effect by reducing the grid spacing, small grid spacings may require so many grid points as 
to result in a substantial increase in computational effort. It is known that the effect can be 
avoided by treating the projectors of pseudopotentials in Fourier space following the opera- 
tion in conventional plane wave formalisms. However, this procedure results in an increase of 
the computational costs from 0(mM) to 0(mMN) and the degradation of the performance 
on massively parallel computers, where m, M, and N are the numbers of atoms, occupied 
bands, and total grid points in the supercell, respectively. These procedures contradict an 
important demand in the simulations using PAW pseudopotentials, i.e., to perform very 
precise total energy calculations with a modest computational cost. Several approaches for 
this effect have been proposed in the framework of norm- conserving pseudopotentials so far 
and they are categorized into two approaches. One is to directly modify the behavior of the 
pseudopotentials, whieh vary sharply in the vicinity of nnclei, by filtering procedures. King- 
Smith et al. [30] were the first to introduce the careful treatment using the FF procedure 
for projectors of pseudopotentials in context of plane wave formalism, and several groups 
in quantum chemistry subsequently modified and/or simplified that filtering procedure by 
introducing mask functions 16]. These mask functions are very easy to introduce into the 
computational codes, because one does not need to change the codes extensively. However, 
the modification of pseudopotentials seriously destroys the transferability, particularly, in 
the procedures using mask functionsfl6|. The other approach is to reduce the grid spacing 
near the nuclei, and various attempts based on this approach have been madejs, Q-15], 
one of which is the CPU time-saving DG technique 12j, ll3[. It is used to execute the inte- 
grals concerning the pseudopotentials on the denser grids without introducing any artificial 
parameter or increasing the computational costs during the self-consistency iterations, and 
have achieved considerable success in stud ying atomic configurations, electronic structures 



and transport properties of nanostructures 



19 
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27|, |32|. Furthermore, the DG technique 



can be incorporated with the filtering procedures mentioned above. In the following sec- 
tions, we introduce the computational scheme combining the FF proposed by King-Smith 
et a/.|30] and the DG technique^, Q, [l3| to overcome the egg box effect in computations 



treating transition metals. 
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B. Separable nonlocal form of pseudopotential 



Computations concerning wave functions and nonlocal parts of pseudopotentials are usu- 
ally implemented in electronic-structure calculations using a separable form, e.g., using the 



procedure proposed by Kleinman and Bylander 33j. Following the notation of King-Smith 



et al. [30], the real-space representation of a separable form may be written as 

= EEE EffinVr, ^03(03(0 (1) 

e m=-l t 

where i and m are orbital and azimuthal angular momentum quantum numbers, respectively, 
Ytm is the spherical harmonic, C<?( r ) is a radial projection function that vanishes outside the 
cutoff region of pseudopotentials (r > r c ), E\ is an angular momentum dependent energy, 
and t is the index of the projectors. Note that more than one projector for the same Im are 
employed to improve the transferability of pseudopotentials in some cases. 

Both FF and DG exploit the fact, that the pseudo wave functions are considerably 
smoother than the projectors of the pseudopotential. If the inner products between wave 
functions and the projectors are calculated naively by evaluating the integrands on each grid 
point and summing up these values, the required grid spacing is determined by the shape 
of the projectors. But a significantly coarser grid is sufficient for the precise description of 
the smooth wave functions, and the inner product can be rewritten as a function of wave 
functions that needs to be evaluated only on that coarser grid. Such a procedure can remark- 
ably reduce the computational effort, as the operations involving the projectors but not the 
wave functions can be executed prior to the selfconsistency cycle. The remaining operations 
needed to determine the inner product are executed only for the coarse grid points. 



C. Fourier filtering procedure of pseudopotentials 

In the case of FF, the inner products between wave functions and the projectors are 
evaluated on a coarse grid after smoothening the projectors. This is done by transforming 
the projectors to Fourier space and removing the fastest- varying components as the following 
procedure. 
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Ci(r) is transformed to reciprocal space: 

"2 roc 



CM) = \ - / r 2 C(r)j e ( q r)dr, (2) 



o 



where ji is the spherical Bessel function of order i and I corresponds to the orbital angu- 
lar momentum of spherical harmonics. The modified pseudopotential functions are then 
transformed back to real space by Fourier transform: 

p2 flout 

X\{r) = Y - ^ <f~CM)3Mr)dq + Ax5(r), (3) 

where A%j(r) is an additional term such that xK r ) vanishes outside the sphere slightly larger 
than the cutoff region of pseudopotentials and q cut is the plane wave cutoff, which normally 
corresponds to n/H with H being a real-space grid spacing so as to be equal to that of 
the plane wave calculation that uses a Fourier transform grid with the same spacing as the 
RSFD calculation. We employ the original procedure proposed by King-Smith et a/.[30| to 
set up Axi{r), although many variations of the scheme have been proposed up to now 16]. 

Thus, the expectation value of the one-particle wave function if> with respect to the 
projectors of pseudopotential of atom s is 

i(;*(r)v s (r, r')ip(r')drdr' 
E E E^'/ r{r)p£{r s )dr f ^p^dr, 




m=-t t 



(4) 



where P^( r )[ = Yem(6r s , 0r s )x^ 4 ( rS )] are the projectors, r s = r — R s , and R s is the position 
of the atom s. Furthermore, this filtering procedure is also applicable to the local parts of 
pseudopotentials using the separation procedure of the local parts described in Refs. Q and 
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D. Double-grid technique 

In the DG technique, the inner products are evaluated on a dense grid by interpolating the 
pseudo wave functions ip(r). Since the interpolation scheme can be rewritten as a function 
of ip{r) (with ip(r) given on a coarse grid), the integrations are carried out on a coarse grid 
without degrading the numerical precision. 
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The DG employed here consists of two types of uniform and equi-distant grid points, i.e., 
coarse and dense ones, depicted in Fig. Ufa) by "o" and respectively. The dense-grid 
region enclosed by the circle is the core region of an atom that is taken to be large enough 
to contain the region in which the projectors do not vanish. We postulate here that pseudo 
wave functions are defined and updated only on coarse-grid points, while pseudopotentials 
are strictly given on all dense-grid points in an analytically or numerically exact manner. 

Let us consider inner products between pseudo wave functions ip{x) and the projectors 
Pe'm( x ) [ see Fig- Hfb)] in Eq. In the present scheme, pg^x) are the filtered projectors 
using the procedure in the preceding subsection while it is exactly the pseudopotential in the 
original DG technique. For simplicity, the illustration is limited to the one-dimensional case 
and an atom s exits at the origin, hereafter. Pseudopotentials are made to be finite at the 
original, and so the resulting pseudo wave functions are rather smoothly varying functions 
without nodes inside the cutoff region. On the contrary, the projectors, such as the p-states 
of first-row elements and the d-states of transition metals, are rapidly oscillating or rapidly 
varying functions. In this sense, pseudo wave functions are softer than pseudopotentials. 
In Fig. [T](b), the values of pseudo wave functions on coarse-grid points (o) are stored in 
computer memory, and the values on dense-grid points (•) are evaluated by interpolation 
from them. The well-known values of pseudopotentials both on coarse- and dense-grid points 
(o) are also shown schematically. One can see that only the values on coarse-grid points 
are so inadequate that the inner products cannot be precisely calculated; the errors are 
mainly due to the rapidly varying behavior of pseudopotentials. On the other hand, the 
inner products can be evaluated to great precision, if the number of dense-grid points is 
taken to be sufficiently large and also if the values of pseudo wave functions on dense-grid 
points are properly interpolated from those on coarse-grid points. 

Although there are many interpolation schemes, we introduce the nth Lagrange interpo- 
lation. The pseudo wave functions fa = faxi) on dense-grid points Xi are interpolated from 
\&j = ip(Xi) on coarse-grid points Xj as 

k 

= J2 ^J+KA K (xi), (5) 

K=-k+l 

where k = [k/2] + 1, J = [i/n], and Ak is the weight of the interpolation. In addition, 
[x] means the maximum integer not greater than x. The inner product is assumed to be 
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precisely approximated by the discrete sum over the dense-grid points, i.e., 

/d/2 nN+n-1 
^( x )pti( x ) dx ~ Y ^( x i)Pei( x i) h i ( 6 ) 
~ d / 2 i=-nN-n+l 

where d is the "diameter" of the core region, h is the dense-grid spacing, and n — 1 is the 
number of dense-grid points existing between adjacent coarse-grid points, i.e., n = H/h with 
H being the coarse-grid spacing, and 2N + l(2nN +1) is the number of coarse- (dense-) grid 
points in the core region. Since pseudopotentials are made to be finite, we postulate that 
Pe'm( x ) vanishes at \x\ > Xn + i(\x\ > x n N +n ). Now, substituting Eq. (jSJ) into the right-hand 
side of Eq. (jSJ) , we have 

rd/2 

^( x )p S ei( x )d x 

d/2 

nN+n-1 k 

Y Pei( x i) Y ^J+KA K (xi)h 

i=- n N-n+l K=-k+l 
N k n-1 

Y Y ^i+KY pS ^ Xni +^ AK ^ Xni +^ h 

I=-N-l K=-k+l m=0 
N+k 

Y (7) 

I=-N-k 



where 



nk 

s.t 



W im 



,I = ~Y P S ii( X nI+u)A^ [u/n] (x nI+u ). (8) 



n 

u=—nk 



As shown in Eq. (J?)), the right-hand side of the inner product Eq. has been replaced with 
the summation over coarse-grid points inside the core region, which produces only a modest 
overhead in the computational cost. 

In the PAW formalism, the local effective potential (sum of Coulomb and exchange- 
correlation potentials) is described on a grid that is two or three times denser than that 
for pseudo wave functions. This is a particular difference of the PAW method from the 
norm-conserving pseudopotentials. To solve the eigenvalue problem for the Kohn-Sham 
Hamiltonian, the local effective potential has to be described on the coarse grid points 
because pseudo wave functions are defined only on the coarse grid. This transformation 
from the dense grid to the coarse grid is achieved by the Fourier transform in the case of the 
plane wave method. On the other hand, in the RSFD scheme, the egg box effect is a serious 
problem if one simply picks up the values of the potential on the dense grid and uses them 
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in the eigenvalue problem. Assuming that the dense grid for the local effective potential of 
the PAW method corresponds to that in the DG technique, we can apply the DG technique 
to the transformation of the local effective potential as introduced in Refs. 5) and 13. In 
the energy functional, the inner product of the local effective potential v e ^{x) and pseudo 



charge density p(x), which is defined as n(r) in Ref. 



26 



is given by 



/ 



nN a ii+n —l 

p(x)v ef] \x)dx « 2_j p{ x i)v eff { x i)h, (9) 

i=- n N all -n+l 



where Q and 2N a u + 1 are the volume of the computational region and the number of coarse 
grid points in the computational region, respectively. The pseudo charge density p^ = n(xi) 
on dense-grid points Xj is interpolated from Pj = p{Xj) on coarse-grid points Xj as 

k 

p(xi)= Pj+KA K (xi). (10) 

K=-k+l 

By substituting Eq. (fit)]) into the right-hand side of Eq. (jUJ) and taking into account that 
the smooth pseudo charge density vanishes outside the computational region in the case of 
an isolated boundary condition, we have 



/ p(x)v e ff (x)dx 
Jn 

nN aU +n-l k 

v eff &) Yl pj+KA K ( Xl )h 

i = - n N all -n+l K=-k+l 
N a ii k rt-1 

= Pi +K ^2v eff (x nI+tl )A K (x nI+IJi )h 

I=-N aU -l K=-k+l fi=0 
N a „ 

= Pi< ff H, (11) 

I=-N„,i 



where 



j nk 

W] ff = ~ Y Veff { X nI+v)A- [y/n ]{x nI+v ). (12) 



n 

u=—nk 



Calculating the derivative with respect to the pseudo wave functions we find that w e /^ is 
the contribution of the local effective potential in the eigenvalue problem for the Kohn-Sham 
Hamiltonian. The extension to a periodic boundary conditions is straightforward. 
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III. PERFORMANCE RESULTS 



We now examine the efficiency of the present combination of the FF and the DG tech- 
nique. Hereafter, we choose the seventeen-point finite-difference formula, i.e., Nf = 8 in 
Eq. (1) of Ref. \% for the differentiation of the wave function. The dense-grid spacing is 
set as hfj, = where (// = x, y, and z) is the coarse-grid spacing in the \i direc- 

tion. The seventeenth-order Lagrangian interpolation is used for the interpolation of the 



DG technique. In order to demonstrate the precision of the RS 
are compared to those obtained by PWPAW|3l| and FLAPW 



'D calculations, our results 



24 



25] calculations. In all 



calculations throughout the paper we used the Vosko, Wilk and Nussair[28| approximation 
to treat the unknown exchange correlation functional within the framework of the density- 
functional theory [lj. The parameters for the pseudopotential generation and the PAW data 
sets for the RSFD calculation are summarized in Table [fl which are taken from another 
first-principles code based on the plane wave formalism [34J. All cut-off parameters for the 
RSFD and FLAPW calculations are given with the respective data presented and the data 



obtained by the PWPAW method are taken from Ref. 



31 



A. Total energy convergence 

The total energy convergence with respect to grid spacing is an important test for the 
RSFD formalism utilizing the present combination of the FF and the DG technique because 
computations with large grid spacing are one of the advantages of the PAW pseudopotentials 
over norm-conserving ones and grid spacing is closely relevant to computational cost. An 
isolated Cu atom is selected as an example in this test. The Cu atom is placed in the center 
of the neighboring grid points for the x, y and z directions. The total energy convergence as 
a function of the cutoff energy is depicted in Fig. [2j In accordance with Ref. [l4j, we defined 
a cutoff energy, tt 2 /H 2 (Ry), that is equivalent to that of the plane wave formalism which 
uses a fast-Fourier-transform grid with the same spacing as the present calculation. When 
neither the FF nor the DG is used, the total energy does not converge even when the cutoff 
energy increases to 60 Ry. Although we can also obtain good convergence when either the 
FF or the DG is adopted, one can see that the use of the combination of the FF and the DG 
yields the best convergence among them; convergence to 1 meV/atom is achieved at about 
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55 Ry. 



B. Egg box effect 

To illustrate the efficiency of the present scheme for the egg box effect, a test calculation 
is performed on the total energy variation with respect to grid points. In the calculation 
using a discrete grid, the loss of translational invariance manifests spurious variation of the 
total energies and forces, which prevents us from implementing practical calculations. Fig. [3] 
shows the difference in the total energy between the cases in which the atom is placed in 
the center of the neighboring grid and the atom is shifted by 0.5 H x along the 
function of the cutoff energy. The loss of translational invariance in the present combination 
of the FF and the DG technique is the smallest and the difference at about 45 Ry is as 
small as 1 meV/atom. In our previous studyH, the interatomic distance of Cu dimer were 
examined using the norm-conserving pseudopotentials. The grid spacing was 0.265 bohr 
and the deviation of the total energy due to the egg box effect was ~4 meV/atom. From 
the present results concerning the total energy convergence and the prescription for the egg 
box effect, a grid spacing of 0.42 bohr is sufficient for the precise treatments of the system 
including Cu in the present scheme, which means that we have succeeded to reduce the 
number of grid points 75% and improve the precision compared to our previous scheme. 

C. Consistency between total energy and force 

Consistency between total energy and force is of importance for molecular-dynamics 
simulations. We then compute the force acting on atoms to check the consistency. Figure H] 
shows the computational model: The supercell, which contains 5 Cu atoms consisting of one 
adatom and two rigid Cu(001) planes, is L x =L y =a and L z =4a under periodic boundary 
condition, where L x , L y , and L z are the lengths of the supercell in the x, y, and z directions, 
respectively, and ao is the experimental lattice constant of the Cu bulk (6.68 bohr). A grid 
spacing of 0.42 bohr, which corresponds to the plane wave cutoff energy of 57 Ry, is used 
and 8x8x1 fc-point mesh is employed for the integration in the Brillouin zone. The position 
of the adatom is displaced by 0.05 bohr along the z axis. The total energy and force as a 
function of the height of the adatom from the surface are plotted in Fig. [5j The numerical 
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derivation of the total energy is computed using the seven-points finite-difference formula; 
the error due to the numerical derivation is ~ 10~ 8 mHartree in this case. The maximum 
difference between the computed force and the numerical derivation is 0.17 mH/bohr. Note 
that the consistency between total energy and force is excellent and the precision of the 
force is enough to implement first-principles molecular-dynamics simulations. 



D. Migration energy 

To demonstrate the precision of force in the practical calculations, the energy barrier for 
a Cu adatom to hop along the [110] direction on the Cu(001) surface is examined using 
the same surface with that of Sec. IIII CI We evaluate the ground-state geometry for the 
adatom located at a hollow site A, and then displace the adatom along the [110] direction 
from the fee hollow site A to the nearest hollow site via the bridge site B (see Fig. [ 



For comparison, we compute the energy barrier using FLAPW 



24 



25| and plane wave [35] 



calculations. The plane wave method employs the ultrasoft pseudopotentials proposed by 



Vanderbilt 22] . Same fc-point set and supercell are adopted in all three calculations. The 
optimized height of the adatom from the surface layer and computed migration energy 
barrier are shown in Table HT1 The numerical errors are found to be 0.2% and 0.3% for the 
height of the adatom and migration energy, respectively, and those are negligible in practical 
simulations. 



E. Bulk properties 

Our final series of tests is the calculation of the bulk properties of transition metals. 
These systems are usually treated with the plane- wave basis set. In order to ensure the 
precision of the RSFD formalism with the present combination of the FF and the DG, the 
computed bulk properties are compared with other theoretical results. The cuboid supercells 
are employed and the /s-space integrations are performed with 15 x 15 x 15 k, 12 x 12 x 
12 k, and 14 x 8 x 8 k points, yielding 240, 240, and 185 k points in the irreducible wedge 
of the Brillouin zone for the body-centered-cubic (bec), fee, and hep structures, respectively. 
Table [TTT1 clearly shows that our results are in excellent agreement with those obtained using 
the PWPAW and FLAPW codes. The parameters in the PAW pseudopotentials affect the 
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results slightly, therefore the difference in the results between the three numerical methods 
are mainly attributed by the PAW pseudopotentials. 

We next compute the structural energy differences between different bulk phases of tran- 
sition metals. This calculation requires an extremely high precision because the differences 
are quite small. The cutoff energies and the number of sampling fc-points are the same as 
those listed in Table IIHI As can be seen from Table IIVI the agreement between the current 
RSFD code implemented in the present scheme, the PWPAW code and the FLAPW code 
is excellent. With the exception of Fe all ground state structures are well reproduced with 
deviations in the order of 10% or better. This is really an achievement considering the 
difference of the methods. Even the fee structure of Cu is well reproduced which is only 
about 8 meV lower in energy than the hep structure. Fe deserves a special mentioning. 
Our calculations do not reproduce the bec ground-state structure, a well-known failure of 
the local-spin-density approximation 28[ and cannot be not attributed to the computational 
methods. In fact the results obtained by the different methods are internally consistent. 
These results imply that the RSFD formalisms with the combination of the FF and the DG 
technique are readily applicable to simulations treating transition metals with a high degree 
of precision. 



IV. SUMMARY AND CONCLUSION 

We developed a first-principles electronic structure method that solves the projector aug- 
mented wave (PAW) formalism by a real-space finite-difference (RSFD) approach which 
exhibits full-potential all-electron precision with a moderate grid spacing even for transition 
metals. The scheme developed combines the advantage of the time-saving double-grid tech- 
nique with the capability of the Fourier-filtering procedure and was benchmarked agains the 
full-potential linearized augmented plane wave (FLAPW) method. The FLAPW method is 
widely considered the most accurate first-principles method for solids, which essentially pro- 
vides the Kohn-Sham answer to the chosen functional approximating the unknown exchange 
correlation energy. We have shown that the present RSFD method is a band-structure 
method that gives ground state properties, such as lattice parameters, bulk moduli, and 
magnetic moments with the same accuracy as the FLAPW method, If identical exchange 
correlation functionals are used in the calculations, the differences between the RSFD and 
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the FLAPW codes results are as small as ~ 0.01 bohr, ~ 0.05 Mbar, and ~ 0.03 /j,b for 
lattice parameters, bulk modulii, and magnetic moments, respectively. Although the cut- 
off energy required in the RSFD calculations is almost twice as large as that employed in 
the plane wave formalisms, this disadvantage can be compensated by the advantages of 
the RSFD formalism, e.g., excellent performance on massively parallel computers, small 
computational cost for the operations concerning the projectors of pseudopotentials, and 
suitable algorithms for linear scaling computation. The development of a simulation code 
combining the RSFD formalism and the PAW method dedicated to performing large-scale 
first-principles calculations on massively parallel computers is in progress. 
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FIG. 1. (a) DG adopted in the text. The "o" and "•" correspond to coarse- and dense-grid 
points, respectively. The circle shows the core region of an atom that is taken to be sufficiently 
large to contain the region in which the projectors are non-zero, (b) Wave function ip(x) and 
pseudopotential projector on coarse- and dense-grid points in the one-dimensional case. Xj 

(xj) represents a coarse- (dense-) grid point with i = nl + fj, (0 < /J, < n), and hence Xj = x n j. 
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FIG. 2. (color online) Convergence of total energy for Cu atom as a function of the coarse-grid 
cutoff energy ir /H 2 (Ry). Zero total energy is set at that computed with cutoff energy of 67 Ry. 



21 




FIG. 3. (color online) Total energy variation E — E' of Cu atom owing to egg box effect as a 
function of the coarse-grid cutoff energy tt 2 /H 2 (Ry). Here, E is the total energy when the atom 
is located at the center between adjacent coarse-grid points and E 1 is the energy when the atom is 
shifted by 0.5H X along the x direction. 
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FIG. 4. (color online) Top view of Cu(001) surface-layer-atoms, second-layer atoms and adatom 
(crosses) for the check of consistency between total energy and force discussed in the text. The 
test system is a cell of 5 Cu atoms consisting of one adatom and two rigid Cu(001) planes. This 
model is also employed to compute the energy barrier for the surface migration in Sec. IIIIDI 
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FIG. 5. Energy and force for an adatom on Cu(OOl) surface at different heights. Top: The circles 
show the calculated energies and the curve shows a spline fit. Bottom: The circles show the 
calculated forces and the curve is minus the derivative of the energies in the top panel using the 
seven-points finite-difference formula. The zero of the height is set at the most stable position. The 
edges of the arrows in the lower panel correspond to the coarse grid planes parallel to the Cu(OOl) 
surface. 
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TABLES 
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TABLE I. Parameters of PAW data sets used in the present work. r c , q cu t, and e\ (l=s, p, and 
d, and t=l and 2) are the cutoff radius, the filtering parameter in Eq. ([2]), and the eigenvalue of 
the partial waves, respectively. Rq and 7 are the filtering parameters defined in Ref. 



30|. e\ in 



the first and second lines is the reference energy for the first (t=l) and second (t=2) projectors, 
respectively, and ej corresponds to the eigenvalue of the Kohn-Sham Hamiltonian. 

r c (bohr) r gL(Ry) 7 2 (Ry) 4(Ry) 4 (%) 4 ( R y) 

Cu 2.20 1.2r c 25 100 -0.358 -0.058 -0.392 

1.642 0.208 

Fe 2.10 1.2r c 25 100 -0.402 -0.106 -0.570 

0.030 

Ni 2.15 1.2r c 25 100 -0.430 -0.098 -0.672 

0.228 

Co 2.10 1.2r c 25 100 -0.416 -0.102 -0.622 

-0.022 

Ti 2.25 1.2r c 25 100 -0.338 -0.113 -0.328 

0.072 
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TABLE II. Migration energy and height from surface layer of adatom on Cu(OOl). AZa and AZb 
are the height of the adatom at the positions A and B in Fig. [H respectively. Plane wave cutoff 
energy of 25.0 Ry and muffin-tin radius of 2.1 bohr are employed in the FLAPW calculations 25]. 



Plane wave cutoff energy of 36.0 Ry and cutoff radius of 2.2 bohr for Vanderbilt's ultrasoft pseu 



dopotentials are employed in the plane wave calculations 



RSFD 
FLAPW 

Plane wave method 



Migration energy (meV) 
661 
673 
672 



AZ A (bohr) 
2.97 
2.97 
2.96 



AZ B (bohr) 
3.61 
3.61 
3.60 
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TABLE III. Equilibrium lattice constant (a and c), bulk modulus (B), and magnetic moment 
(M ). The grid spacings (the cutoff energies) of the RSFD code are 0.43, 0.41, 0.42, 0.42, and 
0.40 bohr (52, 60, 57, 57, and 62 Ry) for Fe, Ni, Co, Cu, and Ti, respectively. In the FLAPW 



calculations 25f| , we use plane wave cutoff energies of 25.0 Ry (Fe, Ni, and Cu) and 14.4 Ry (Co 
and Ti), and muffin-tin radii of 2.1—2.2 bohr (Fe, Ni, Co, and Cu) and 2.6 bohr (Ti). 

Ref. a (bohr) c/a B(Mbar) M {jx B ) 

bcc Fe 

RSFD 5.21 2.53 1.97 

PWPAW 31 5.20 2.47 2.00 

FLAPW 25 5.20 2.54 2.00 
fee Ni 

RSFD 6.48 2.63 0.58 

PWPAW 31 6.48 2.51 0.58 

FLAPW 25 6.47 2.63 0.59 
hep Co 

RSFD 4.59 1.61 1.51 

PWPAW 31 4.59 1.62 1.51 

FLAPW 25 4.59 1.61 1.52 
fee Cu 

RSFD 6.65 1.85 0.00 

FLAPW 25 6.65 1.90 0.00 
hep Ti 

RSFD 5.42 1.58 0.00 

FLAPW 25 5.42 1.58 0.00 
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TABLE IV. Structural energy differences between different bulk phases at the equilibrium lattice 
constants. The most stable phase is chosen as zero energy. Unit is meV/atom. For hep Fe, c/a 
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. Most computational parameters as in 



is set at ideal value to keep the consistency with Ref. 
Table HU but in the FLAPW calculations we increased the cutoff energies to 27.0 Ry and 25.0 Ry 
for Co and Ti, respectively. 



Ref. 



hep 



fee 



NM bec 



FM bec 



Fc 



Co 



Cu 



Ti 



RSFD 

PWPAW 

FLAPW 

RSFD 
FLAPW 

i 

RSFD 
FLAPW 

RSFD 
FLAPW 
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25 



25 



-142 
-139 
-142 






25 



+8 






-72 
-68 



+20 
+23 




+53 
+54 



+288 
+282 
+287 
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